Learning Invariance


Learning Invariances in Neural Networks from Training Data

Neural Information Processing Systems

Invariances to translations have imbued convolutional neural networks with powerful generalization properties. However, we often do not know a priori what invariances are present in the data, or to what extent a model should be invariant to a given augmentation. We show how to learn invariances by parameterizing a distribution over augmentations and optimizing the training loss simultaneously with respect to the network parameters and augmentation parameters. With this simple procedure we can recover the correct set and extent of invariances on image classification, regression, segmentation, and molecular property prediction from a large space of augmentations, on training data alone. We show our approach is competitive with methods that are specialized to each task with the appropriate hard-coded invariances, without providing any prior knowledge of which invariance is needed.
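The joint optimization described above can be sketched in miniature. The snippet below is an illustrative sketch, not the authors' implementation: it reparameterizes a uniform distribution over 2-D rotation angles as theta = extent * u with u ~ U(-1, 1), so that gradients could flow to the learnable `extent`, and averages a linear model's predictions over sampled augmentations. The names `rotate` and `averaged_prediction` are hypothetical.

```python
import math, random

def rotate(x, theta):
    # Apply a 2-D rotation by angle theta to the point x = (x0, x1).
    c, s = math.cos(theta), math.sin(theta)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])

def averaged_prediction(w, x, extent, n_samples=64, rng=None):
    # Forward pass of an augmentation-averaged linear model:
    # sample theta = extent * u with u ~ U(-1, 1) (reparameterization,
    # so gradients with respect to `extent` pass through the sample),
    # rotate the input, and average the predictions.
    rng = rng or random.Random(0)
    total = 0.0
    for _ in range(n_samples):
        theta = extent * rng.uniform(-1.0, 1.0)
        xt = rotate(x, theta)
        total += w[0] * xt[0] + w[1] * xt[1]
    return total / n_samples
```

With `extent = 0` this is the plain linear map; as `extent` grows toward pi the averaged predictor becomes increasingly rotation-invariant (for a linear map, the full-rotation average tends to zero). In the paper's setting, both `w` and `extent` would be updated by the same gradient-based training loop.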


Learning Invariances using the Marginal Likelihood

Neural Information Processing Systems

In many supervised learning tasks, learning what changes do not affect the prediction target is as crucial to generalisation as learning what does. Data augmentation is a common way to enforce a model to exhibit an invariance: training data is modified according to an invariance designed by a human and added to the training data. We argue that invariances should be incorporated into the model structure, and learned using the marginal likelihood, which can correctly reward the reduced complexity of invariant models. We incorporate invariances in a Gaussian process, due to good marginal likelihood approximations being available for these models.
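The kernel construction such an approach rests on can be sketched as follows. Averaging a base kernel over a finite group yields a kernel whose GP samples are exactly invariant, and the log marginal likelihood can then score the invariant model against a non-invariant one. This is an illustrative sketch using a sign-flip group, not the paper's variational implementation; `invariant_kernel` and `log_marginal_likelihood` are hypothetical names.

```python
import numpy as np

def rbf(a, b):
    # Base RBF kernel between two 1-D arrays of scalar inputs.
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2)

def invariant_kernel(a, b, group=(1.0, -1.0)):
    # Double sum of the base kernel over a finite group (here: sign
    # flips) gives a kernel whose GP samples satisfy f(x) = f(-x).
    K = np.zeros((len(a), len(b)))
    for g in group:
        for h in group:
            K += rbf(g * a, h * b)
    return K / len(group) ** 2

def log_marginal_likelihood(K, y, noise=0.1):
    # GP evidence log N(y | 0, K + noise^2 I) -- the quantity the
    # paper argues should be used to score candidate invariances.
    n = len(y)
    L = np.linalg.cholesky(K + noise ** 2 * np.eye(n))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ alpha - np.log(np.diag(L)).sum() - 0.5 * n * np.log(2 * np.pi)
```

Because the group average is baked into the kernel, the invariance is part of the model structure rather than the training set, which is what lets the marginal likelihood reward it.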


Understanding Learning Invariance in Deep Linear Networks

Duan, Hao, Montúfar, Guido

arXiv.org Machine Learning

Equivariant and invariant machine learning models exploit symmetries and structural patterns in data to improve sample efficiency. While empirical studies suggest that data-driven methods such as regularization and data augmentation can perform comparably to explicitly invariant models, theoretical insights remain scarce. In this paper, we provide a theoretical comparison of three approaches for achieving invariance: data augmentation, regularization, and hard-wiring. We focus on mean squared error regression with deep linear networks, which parametrize rank-bounded linear maps and can be hard-wired to be invariant to specific group actions. We show that the critical points of the optimization problems for hard-wiring and data augmentation are identical, consisting solely of saddles and the global optimum. By contrast, regularization introduces additional critical points, though they remain saddles except for the global optimum. Moreover, we demonstrate that the regularization path is continuous and converges to the hard-wired solution.
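The coincidence of optima for augmentation and hard-wiring can be checked numerically in the simplest linear setting. The sketch below is illustrative only (a plain linear model, not the deep rank-bounded networks the paper analyzes; all variable names are hypothetical): for invariance to swapping two input coordinates, it compares least-squares fits obtained by augmenting with the group orbit versus constraining the weights directly.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
# The target depends only on x1 + x2, so it is invariant to the
# group action that swaps the two input coordinates.
y = X.sum(axis=1) + 0.1 * rng.normal(size=50)

# Data augmentation: extend the training set with the group orbit
# (coordinate-swapped copies) and solve ordinary least squares.
X_aug = np.vstack([X, X[:, ::-1]])
y_aug = np.concatenate([y, y])
w_aug, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)

# Hard-wiring: constrain w1 == w2 up front by regressing on the
# invariant feature z = x1 + x2.
z = X.sum(axis=1, keepdims=True)
a, *_ = np.linalg.lstsq(z, y, rcond=None)
w_hard = np.array([a[0], a[0]])
```

Both routes recover the same weights (`w_aug` is approximately `w_hard`), consistent with the paper's finding that the two optimization problems share their critical points; it is the deep, rank-bounded case that requires the paper's analysis.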



Review for NeurIPS paper: Learning Invariances in Neural Networks from Training Data

Neural Information Processing Systems

Weaknesses: I have two major concerns with this submission. First, the paper claims that prior work on data augmentation needs to know the invariance of interest a priori. However, this paper requires exactly the same thing, as the invariance of interest must be expressible by the learnable mapping. For instance, the authors restrict the prescribed transformations to translation, rotation, scaling, and shearing invariances during training; at test time, rotation is precisely the nuisance transformation at play. Second, the proposed test-time data augmentation is a well-known technique, also often used to learn equi-/invariant classifiers.
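The test-time data augmentation the reviewer refers to amounts to averaging predictions over augmented copies of the input. A minimal illustrative sketch (not code from the submission; `tta_predict` is a hypothetical name):

```python
import random

def tta_predict(model, x, augment, n_samples=32, rng=None):
    # Test-time augmentation: average the model's predictions over
    # randomly augmented copies of the input, approximating an
    # orbit-averaged (invariant) predictor without retraining.
    rng = rng or random.Random(0)
    return sum(model(augment(x, rng)) for _ in range(n_samples)) / n_samples
```

For a model already invariant to the augmentation (e.g. `model(v) = v * v` under sign flips) the average leaves predictions unchanged; for a non-invariant model it symmetrizes them.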


Review for NeurIPS paper: Learning Invariances in Neural Networks from Training Data

Neural Information Processing Systems

The authors propose to learn a data-augmentation strategy that improves performance compared to one in which all augmentation parameters are randomized. The results are not on the datasets used by SOTA methods, and some of them are in the appendix instead of the main paper. I agree with the authors that R2 might have misunderstood the paper; R2 also did not participate in the post-rebuttal discussion. R2's review states, "The remaining contribution of this paper is the use of the re-parametrization trick to adapt the group over which we want to be invariant on, which is in my opinion not a substantial contribution to present this paper in NeurIPS." I don't think we should judge papers solely on the novelty of the technical section.



Reviews: Learning Invariances using the Marginal Likelihood

Neural Information Processing Systems

In this manuscript the authors present a scheme for ranking and refining invariance transformations within the Bayesian approach to supervised learning, thereby offering a fully Bayesian alternative to data augmentation. The implementation presented also introduces a scheme for unbiased estimation of the evidence lower bound for transformation representation models having a Normal likelihood (or writeable in an equivalent form as per the Polya-Gamma example). Moreover, the authors have been careful to ensure that their implementation is structured for efficiency within the contemporary framework for stochastic optimisation of sparse approximations to the GP (variational inference with mini-batching, etc.). Although limited by space, the examples presented are convincing of the potential of this approach. It is my view that this is a valuable and substantial contribution to the field, although I would be prepared to concede in relation to the NIPS reviewer guidelines that in some sense the progress is incremental rather than revolutionary.


Learning Invariances using the Marginal Likelihood

Wilk, Mark van der, Bauer, Matthias, John, ST, Hensman, James

Neural Information Processing Systems

In many supervised learning tasks, learning what changes do not affect the prediction target is as crucial to generalisation as learning what does. Data augmentation is a common way to enforce a model to exhibit an invariance: training data is modified according to an invariance designed by a human and added to the training data. We argue that invariances should be incorporated into the model structure, and learned using the marginal likelihood, which can correctly reward the reduced complexity of invariant models. We incorporate invariances in a Gaussian process, due to good marginal likelihood approximations being available for these models.